AITopics | anomalous data


anomalous data



Dissect Black Box: Interpreting for Rule-Based Explanations in Unsupervised Anomaly Detection

Neural Information Processing Systems | Mar-21-2026, 18:08:31 GMT

In high-stakes sectors such as network security and IoT security, accurately distinguishing between normal and anomalous data is critical because of its significant implications for operational success and safety in decision-making. The difficulty is exacerbated by the presence of unlabeled data and the opaque nature of black-box anomaly detection models, which obscure the rationale behind their predictions. In this paper, we present a novel method to interpret the decision-making processes of these models, which are essential for detecting malicious activities without labeled attack data. We put forward the Segmentation Clustering Decision Tree (SCD-Tree), designed to dissect and understand the structure of normal data distributions.

artificial intelligence, data mining, machine learning, (6 more...)

Neural Information Processing Systems

Industry: Information Technology > Security & Privacy (0.59)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.75)


Quantum Autoencoders for Anomaly Detection in Cybersecurity

Senthil, Rohan, Wong, Swee Liang

arXiv.org Artificial Intelligence | Oct-28-2025

Anomaly detection in cybersecurity is a challenging task: normal events far outnumber anomalous ones, and new anomalies occur frequently. Classical autoencoders have been used for anomaly detection, but they struggle in data-limited settings, a limitation that quantum counterparts can potentially overcome. In this work, we apply Quantum Autoencoders (QAEs) for anomaly detection in cybersecurity, specifically on the BPF-extended tracking honeypot (BETH) dataset. QAEs are evaluated across multiple encoding techniques, ansatz types, repetitions, and feature selection strategies. Our results demonstrate that an 8-feature QAE using Dense-Angle encoding with a RealAmplitude ansatz can outperform Classical Autoencoders (CAEs), even when trained on substantially fewer samples. The effects of quantum encoding and feature selection for developing quantum models are demonstrated and discussed. In a data-limited setting, the best-performing QAE model has an F1 score of 0.87, better than that of the CAE (0.77). These findings suggest that QAEs may offer practical advantages for anomaly detection in data-limited scenarios.

data mining, detection, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2510.21837

Country: Asia > Singapore (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.92)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)


Cyber-Resilient System Identification for Power Grid through Bayesian Integration

Li, Shimiao, Qu, Guannan, Hooi, Bryan, Sekar, Vyas, Kar, Soummya, Pileggi, Larry

arXiv.org Artificial Intelligence | Oct-17-2025

Power grids increasingly need real-time situational awareness under the ever-evolving cyberthreat landscape. Advances in snapshot-based system identification approaches have enabled accurate estimation of states and topology from a snapshot of measurement data, under random bad data and topology errors. However, modern interactive, targeted false data can remain undetected by these methods and significantly compromise estimation accuracy. This work advances system identification by combining a snapshot-based method with a time-series model via Bayesian integration, improving cyber resiliency against both random and targeted false data. Using a distance-based time-series model, this work can leverage historical data of different distributions induced by changes in grid topology and other settings. The normal system behavior captured from historical data is integrated into system identification through a Bayesian treatment, to make solutions robust to targeted false data. We experiment on mixed random anomalies (bad data, topology error) and targeted false data injection attacks (FDIA) to demonstrate our method's 1) cyber resilience: achieving over 70% reduction in estimation error under FDIA; 2) anomalous data identification: being able to alarm and locate anomalous data; 3) almost linear scalability: achieving comparable speed with the snapshot-based baseline, both taking <1 min per time tick on the large 2,383-bus system using a laptop CPU.

data mining, data quality, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2510.14043

Country: Europe > Middle East > Cyprus (0.16)

Genre: Research Report (1.00)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.93)
(2 more...)


Low-Communication Resilient Distributed Estimation Algorithm Based on Memory Mechanism

Li, Wei, Hu, Limei, Chen, Feng, Yao, Ye

arXiv.org Artificial Intelligence | Aug-6-2025

In multi-task adversarial networks, the accurate estimation of unknown parameters in a distributed algorithm is hindered by attacked nodes or links. To tackle this challenge, this brief proposes a low-communication resilient distributed estimation algorithm. First, a reputation-based node selection strategy is introduced that allows nodes to communicate with a more reliable subset of neighbors. Subsequently, to discern trustworthy intermediate estimates, the Weighted Support Vector Data Description (W-SVDD) model is trained on the memory data. This trained model helps reinforce the resilience of the distributed estimation process against the impact of attacked nodes or links. Additionally, an event-triggered mechanism is introduced to minimize ineffective updates to the W-SVDD model, and a suitable threshold is derived based on assumptions. The convergence of the algorithm is analyzed. Finally, simulation results demonstrate that the proposed algorithm achieves superior performance with less communication cost compared to other algorithms.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2508.02705

Country: Asia > China (0.28)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)


A Mathematical Optimization Approach to Multisphere Support Vector Data Description

Blanco, Víctor, Espejo, Inmaculada, Páez, Raúl, Rodríguez-Chía, Antonio M.

arXiv.org Artificial Intelligence | Jul-16-2025

We present a novel mathematical optimization framework for outlier detection in multimodal datasets, extending Support Vector Data Description approaches. We provide a primal formulation, in the shape of a Mixed Integer Second Order Cone model, that constructs Euclidean hyperspheres to identify anomalous observations. Building on this, we develop a dual model that enables the application of the kernel trick, thus allowing for the detection of outliers within complex, non-linear data structures. An extensive computational study demonstrates the effectiveness of our exact method, showing clear advantages over existing heuristic techniques in terms of accuracy and robustness.

artificial intelligence, machine learning, optimization problem, (16 more...)

arXiv.org Artificial Intelligence

2507.11106

Genre: Research Report (0.50)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)


Dissect Black Box: Interpreting for Rule-Based Explanations in Unsupervised Anomaly Detection

Neural Information Processing Systems | May-27-2025, 10:02:35 GMT

dissect black box, rule-based explanation, unsupervised anomaly detection, (4 more...)

Neural Information Processing Systems

Industry:

Transportation > Air (0.64)
Information Technology > Security & Privacy (0.61)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


Post-Hoc Calibrated Anomaly Detection

Gloumeau, Sean

arXiv.org Artificial Intelligence | Mar-25-2025

Deep unsupervised anomaly detection has seen improvements under a supervised binary classification paradigm in which auxiliary external data is included in the training set as anomalous data, a process referred to as outlier exposure. This opens the possibility of exploring the efficacy of post-hoc calibration for anomaly detection and localization. Post-hoc Platt scaling and Beta calibration are found to improve results with gradient-based input perturbation, as well as post-hoc training with a strictly proper loss of a base model initially trained on an unsupervised loss. Post-hoc calibration is also found at times to be more effective using random synthesized spectral data as labeled anomalous data in the calibration set, suggesting that outlier exposure is superior only for initial training.
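
As a rough illustration of the post-hoc Platt scaling mentioned in this abstract, the sketch below fits a sigmoid that maps raw anomaly scores to probabilities. It is not the paper's implementation: the function names `fit_platt` and `calibrate`, the toy calibration set, and the plain gradient-descent fit (without Platt's target smoothing) are all assumptions made for illustration.

```python
import math


def fit_platt(scores, labels, lr=0.1, epochs=2000):
    """Fit p(anomaly | s) = sigmoid(a * s + b) by gradient descent on log loss.

    Simplified sketch: no target smoothing, no regularization.
    """
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            # Gradient of the log loss with respect to a and b.
            grad_a += (p - y) * s
            grad_b += p - y
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrate(score, a, b):
    """Map a raw anomaly score to a calibrated probability."""
    return 1.0 / (1.0 + math.exp(-(a * score + b)))


# Hypothetical calibration set: raw detector scores, 0 = normal, 1 = anomalous.
scores = [0.1, 0.3, 0.4, 0.9, 1.2, 1.5, 2.0, 2.4]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

a, b = fit_platt(scores, labels)
p_low = calibrate(0.1, a, b)
p_high = calibrate(2.4, a, b)
print(f"a={a:.3f}, b={b:.3f}, p(0.1)={p_low:.3f}, p(2.4)={p_high:.3f}")
```

The base detector stays frozen; only the two scalars `a` and `b` are learned on the calibration set, which is what makes the step post-hoc.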

calibration, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2503.19577

Country:

Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)


Swift Hydra: Self-Reinforcing Generative Framework for Anomaly Detection with Multiple Mamba Models

Do, Nguyen, Nguyen, Truc, Hassanaly, Malik, Alharbi, Raed, Seo, Jung Taek, Thai, My T.

arXiv.org Machine Learning | Mar-8-2025

Despite a plethora of anomaly detection models developed over the years, their ability to generalize to unseen anomalies remains an issue, particularly in critical systems. This paper aims to address this challenge by introducing Swift Hydra, a new framework for training an anomaly detection method based on generative AI and reinforcement learning (RL). By featuring an RL policy that operates on the latent variables of a generative model, the framework synthesizes novel and diverse anomaly samples that are capable of bypassing a detection model. These generated synthetic samples are, in turn, used to augment the detection model, further improving its ability to handle challenging anomalies. Swift Hydra also incorporates Mamba models structured as a Mixture of Experts (MoE) to enable scalable adaptation of the number of Mamba experts based on data complexity, effectively capturing diverse feature distributions without increasing the model's inference time. Empirical evaluations on the ADBench benchmark demonstrate that Swift Hydra outperforms other state-of-the-art anomaly detection models while maintaining a relatively short inference time. From these results, our research highlights a new and auspicious paradigm of integrating RL and generative AI for advancing anomaly detection.

avg 0, macro avg 0, weighted avg 0, (14 more...)

arXiv.org Machine Learning

2503.06413

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > South Korea (0.04)
Asia > Middle East > Saudi Arabia (0.04)

Genre: Research Report (0.64)

Industry:

Energy (1.00)
Health & Medicine > Therapeutic Area (0.46)
Government > Regional Government > North America Government > United States Government (0.46)
Information Technology > Security & Privacy (0.45)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.44)


A Novel Spatiotemporal Correlation Anomaly Detection Method Based on Time-Frequency-Domain Feature Fusion and a Dynamic Graph Neural Network in Wireless Sensor Network

Ye, Miao, Jiang, Zhibang, Xue, Xingsi, Li, Xingwang, Wen, Peng, Wang, Yong

arXiv.org Artificial Intelligence | Feb-24-2025

Attention-based transformers have played an important role in wireless sensor network (WSN) timing anomaly detection due to their ability to capture long-term dependencies. However, several issues must be addressed: their ability to capture long-term dependencies is not completely reliable, their computational complexity is high, and the spatiotemporal features of WSN timing data are not sufficiently extracted for detecting the correlation anomalies of multinode WSN timing data. To address these limitations, this paper proposes a WSN anomaly detection method that integrates frequency-domain features with dynamic graph neural networks (GNNs) under a designed self-encoder reconstruction framework. First, the discrete wavelet transform decomposes the trend and seasonal components of time series to address the poor long-term reliability of transformers. Second, a frequency-domain attention mechanism is designed to make full use of the difference between the amplitude distributions of normal data and anomalous data in this domain. Finally, a multimodal fusion-based dynamic graph convolutional network (MFDGCN) is designed by combining an attention mechanism and a graph convolutional network (GCN) to adaptively extract spatial correlation features. A series of experiments conducted on public datasets demonstrates that the anomaly detection method designed in this paper achieves higher precision and recall than existing methods, with an F1 score of 93.5%, representing an improvement of 2.9% over that of the existing models.
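
To make the discrete wavelet transform step more concrete, here is a minimal sketch of a single-level decomposition of a sensor series into a smooth approximation and a detail component. The abstract does not say which wavelet the authors use; the Haar wavelet, the function names `haar_dwt` and `haar_idwt`, and the toy readings are assumptions chosen for brevity.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    r = math.sqrt(2.0)
    # Pairwise scaled sums capture the slow-moving part of the series.
    approx = [(signal[i] + signal[i + 1]) / r for i in range(0, len(signal), 2)]
    # Pairwise scaled differences capture abrupt local changes.
    detail = [(signal[i] - signal[i + 1]) / r for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one level of the Haar DWT."""
    r = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / r)
        out.append((a - d) / r)
    return out


# Hypothetical temperature readings from one sensor node, with one spike.
readings = [20.0, 20.2, 20.1, 20.3, 27.5, 20.2, 20.4, 20.3]

approx, detail = haar_dwt(readings)
restored = haar_idwt(approx, detail)
# Index of the pair whose detail coefficient has the largest magnitude.
spike_pair = max(range(len(detail)), key=lambda i: abs(detail[i]))
print(approx, detail, spike_pair)
```

In this toy series the spike shows up as the largest detail coefficient, while the approximation coefficients track the slower trend; the transform is lossless, so the original readings can be recovered from the two parts.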

anomalous data, anomaly detection, attention mechanism, (12 more...)

arXiv.org Artificial Intelligence

2503.00036

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > China > Sichuan Province > Chengdu (0.04)
Asia > China > Fujian Province > Fuzhou (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Communications > Networks > Sensor Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)


Incremental Gaussian Mixture Clustering for Data Streams

Bhanderi, Aniket, Bhatnagar, Raj

arXiv.org Artificial Intelligence | Dec-10-2024

Analyzing very large data streams is an important problem in many application domains. In this paper we present and demonstrate the effective working of an algorithm to find clusters and anomalous data points in a streaming dataset. Entropy minimization is used as a criterion for defining and updating clusters formed from a streaming dataset. As the clusters are formed, we also identify anomalous data points that show up far away from all known clusters. With a number of 2-D datasets we demonstrate the effectiveness of discovering the clusters and also identifying anomalous data points.
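
The general idea of updating clusters one point at a time and flagging points far from every known cluster can be sketched as follows. This is not the paper's algorithm: it swaps the entropy-minimization criterion and Gaussian mixture components for running-mean centroids and two fixed distance thresholds. The class name `StreamClusterer`, both radius parameters, and the toy stream are hypothetical.

```python
import math


class StreamClusterer:
    """Simplified incremental clustering with distance thresholds (illustrative only)."""

    def __init__(self, join_radius, anomaly_radius):
        self.join_radius = join_radius        # max distance to join a cluster
        self.anomaly_radius = anomaly_radius  # min distance to be flagged anomalous
        self.centroids = []
        self.counts = []

    def update(self, point):
        """Process one 2-D point; return 'joined', 'new', or 'anomaly'."""
        if not self.centroids:
            self.centroids.append(list(point))
            self.counts.append(1)
            return "new"
        dists = [math.dist(point, c) for c in self.centroids]
        k = min(range(len(dists)), key=dists.__getitem__)
        if dists[k] <= self.join_radius:
            # Running-mean update of the nearest centroid.
            self.counts[k] += 1
            n = self.counts[k]
            self.centroids[k] = [c + (p - c) / n
                                 for c, p in zip(self.centroids[k], point)]
            return "joined"
        if dists[k] >= self.anomaly_radius:
            # Far from all known clusters: report it, do not absorb it.
            return "anomaly"
        # Moderately far: seed a new cluster.
        self.centroids.append(list(point))
        self.counts.append(1)
        return "new"


# Hypothetical 2-D stream: two tight groups and one far-away point.
stream = [(0.0, 0.0), (0.2, 0.1), (0.1, -0.1), (3.0, 3.0), (3.1, 2.9), (50.0, 50.0)]
clusterer = StreamClusterer(join_radius=1.0, anomaly_radius=10.0)
outcomes = [clusterer.update(p) for p in stream]
print(outcomes)
```

Each point is seen once and only per-cluster summaries are stored, which is the property that makes this style of algorithm suitable for streams; the fixed radii are the crudest possible stand-in for a principled criterion such as the entropy measure the abstract describes.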

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2412.07217

Country:

North America > United States > Maryland > Montgomery County > Bethesda (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
